RNA-Seq Count Data Modelling by Grey Relational Analysis and Nonparametric Gaussian Process

Authors

  • Thanh Nguyen
  • Asim Bhatti
  • Samuel Yang
  • Saeid Nahavandi
Abstract

This paper introduces an approach to classifying RNA-seq read counts using grey relational analysis (GRA) and Bayesian Gaussian process (GP) models. Read counts are transformed to microarray-like data so that statistical methods assuming normality can be applied. GRA is designed to select differentially expressed genes by integrating the outcomes of five individual feature selection methods: the two-sample t-test, the entropy test, the Bhattacharyya distance, the Wilcoxon test and the receiver operating characteristic curve. GRA acts as an aggregate filter method, combining the advantages of the individual methods to produce significant feature subsets that are then fed into a nonparametric GP model for classification. The proposed approach is validated on two real benchmark datasets using five-fold cross-validation. Experimental results show that the GRA-based feature selection method and the GP classifier both outperform their competing methods. Moreover, the results demonstrate that GRA-GP considerably outperforms the sparse Poisson linear discriminant analysis classifiers, which were introduced specifically for read counts, across different numbers of features. The proposed approach can therefore be applied effectively in practice to read count data analysis, which is useful in many applications including understanding disease pathogenesis, diagnosis and treatment monitoring at the molecular level.
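The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the transformation or the exact GRA formulation, so the sketch assumes a `log2(count + 1)` transform, the textbook grey relational coefficient with distinguishing coefficient `zeta = 0.5`, and only three of the five criteria (a Welch t-statistic, a Gaussian Bhattacharyya distance, and an ROC-area score, which is closely related to the Wilcoxon rank-sum statistic). The entropy test and the GP classifier are omitted, and the toy counts and all function names are hypothetical.

```python
import math
from statistics import mean, variance

def log_transform(values):
    # Assumed stand-in for the paper's count-to-microarray-like transform.
    return [math.log2(v + 1) for v in values]

def t_score(a, b):
    # Absolute Welch two-sample t-statistic.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0

def bhattacharyya(a, b, eps=1e-12):
    # Bhattacharyya distance between two univariate Gaussians fitted to a and b.
    va, vb = variance(a) + eps, variance(b) + eps
    return (0.25 * (mean(a) - mean(b)) ** 2 / (va + vb)
            + 0.5 * math.log((va + vb) / (2 * math.sqrt(va * vb))))

def auc_score(a, b):
    # Distance of the ROC area from chance (0.5), via pairwise comparisons.
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    return abs(wins / (len(a) * len(b)) - 0.5)

def grey_relational_grades(scores, zeta=0.5):
    # scores[g][k]: score of gene g under criterion k, larger is better.
    columns = list(zip(*scores))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    deltas = []
    for row in scores:
        # Min-max normalise each criterion, then measure the deviation
        # from the ideal reference sequence (all ones).
        norm = [(v - l) / (h - l) if h > l else 1.0
                for v, l, h in zip(row, lo, hi)]
        deltas.append([1.0 - x for x in norm])
    d_min = min(min(r) for r in deltas)
    d_max = max(max(r) for r in deltas)
    if d_max == 0:
        return [1.0] * len(scores)
    # Grey relational coefficient per criterion, averaged into a grade per gene.
    return [mean([(d_min + zeta * d_max) / (d + zeta * d_max) for d in row])
            for row in deltas]

# Hypothetical toy read counts: one row per gene, four samples per class.
class_a = [[100, 120, 110, 130], [50, 55, 48, 60], [5, 200, 40, 7], [20, 22, 19, 25]]
class_b = [[10, 12, 9, 15], [52, 49, 58, 51], [30, 8, 150, 20], [30, 24, 35, 31]]

scores = []
for a, b in zip(class_a, class_b):
    a, b = log_transform(a), log_transform(b)
    scores.append([t_score(a, b), bhattacharyya(a, b), auc_score(a, b)])

grades = grey_relational_grades(scores)
ranking = sorted(range(len(grades)), key=lambda g: grades[g], reverse=True)
print(ranking, [round(g, 3) for g in grades])
```

Genes at the top of `ranking` would form the feature subset passed on to a classifier. Averaging the coefficients weights all criteria equally, which is one simple design choice among several.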

Similar resources

Optimization of the injection molding process of Derlin 500 composite using ANOVA and grey relational analysis

Warpage and shrinkage control are important factors in improving the quality of thin-wall parts in the injection molding process. In the present paper, grey relational analysis was used to optimize these two parameters in manufacturing the plastic bush of an articulated garden tractor. The material used in the plastic bush is Derlin 500. The input parameters in the process were selected according ...

Dissimilar friction stir lap welding of Al-Mg to CuZn34: Application of grey relational analysis for optimizing process parameters

This study focused on the optimization of the Al-Mg to CuZn34 friction stir lap welding (FSLW) process for the optimal combination of rotational and traverse speeds in order to yield a favorable fracture load using grey relational analysis (GRA). First, the degree of freedom was calculated for the system. Then, the experiments based on the target values and number of considered levels, corresponding orth...

Optimization of gas metal arc welding parameters of SS304 austenitic steel by Taguchi-Grey relational analysis

This study investigated the optimization of three welding parameters (wire feed speed, arc voltage, and shielding gas flow rate) for SS 304H using Taguchi-based grey relational analysis. Pure argon was used as the shielding gas. A number of trials were performed as per the L16 (4xx3) orthogonal array design and the mechanical quality such as ultimate tensile strength, microhardnes...

Multi-objective optimization in WEDM of D3 tool steel using integrated approach of Taguchi method & Grey relational analysis

In this paper, wire electrical discharge machining of D3 tool steel is studied. The influence of pulse-on time, pulse-off time, peak current and wire speed on MRR, dimensional deviation, gap current and machining time is investigated during intricate machining of D3 tool steel. The Taguchi method is used for single-characteristic optimization and to optimize all four process parameters simultaneous...

Universal Count Correction for High-Throughput Sequencing

We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called FIXSEQ. We demonstrate that FIXSEQ substantially improves the perfor...

Journal title:

Volume 11, Issue

Pages -

Publication date: 2016